Overview

Dataset statistics

Number of variables: 15
Number of observations: 6238103
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2678
Duplicate rows (%): < 0.1%
Total size in memory: 713.9 MiB
Average record size in memory: 120.0 B

Variable types

Categorical: 7
Numeric: 8

Alerts

[Duplicates] Dataset has 2678 (< 0.1%) duplicate rows
[High cardinality] ECO has a high cardinality: 493 distinct values
[High cardinality] Opening has a high cardinality: 2941 distinct values
[High correlation] Result is highly correlated with WhiteRatingDiff and 1 other field
[High correlation] WhiteElo is highly correlated with BlackElo
[High correlation] BlackElo is highly correlated with WhiteElo
[High correlation] WhiteRatingDiff is highly correlated with Result and 1 other field
[High correlation] BlackRatingDiff is highly correlated with Result and 1 other field
[High correlation] TimeControl is highly correlated with TimeControl_enc
[High correlation] TimeControl_enc is highly correlated with TimeControl
[High correlation] Termination is highly correlated with Termination_enc
[High correlation] Event_enc is highly correlated with Event
[High correlation] Event is highly correlated with Event_enc
[High correlation] Termination_enc is highly correlated with Termination
[High correlation] Event is highly correlated with Event_enc and 1 other field
[High correlation] Result is highly correlated with BlackRatingDiff
[High correlation] BlackRatingDiff is highly correlated with Result
[High correlation] Event_enc is highly correlated with Event and 1 other field
[High correlation] TimeControl_enc is highly correlated with Event and 2 other fields
[Zeros] TimeControl has 105447 (1.7%) zeros
[Zeros] ECO_enc has 421912 (6.8%) zeros
[Zeros] TimeControl_enc has 105447 (1.7%) zeros

Reproduction

Analysis started: 2022-01-06 13:11:37.807644
Analysis finished: 2022-01-06 13:16:28.230210
Duration: 4 minutes and 50.42 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Event
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
Blitz | 2804020
Bullet | 1740741
Classical | 1671138
Correspondence | 22204

Length

Max length14
Median length6
Mean length6.382652547
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowClassical
2nd rowBlitz
3rd rowBlitz
4th rowCorrespondence
5th rowBlitz

Common Values

Value | Count | Frequency (%)
Blitz | 2804020 | 44.9%
Bullet | 1740741 | 27.9%
Classical | 1671138 | 26.8%
Correspondence | 22204 | 0.4%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value | Count | Frequency (%)
blitz | 2804020 | 44.9%
bullet | 1740741 | 27.9%
classical | 1671138 | 26.8%
correspondence | 22204 | 0.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Result
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
1 | 3098312
2 | 2901028
3 | 238763

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row1
4th row1
5th row2

Common Values

Value | Count | Frequency (%)
1 | 3098312 | 49.7%
2 | 2901028 | 46.5%
3 | 238763 | 3.8%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value | Count | Frequency (%)
1 | 3098312 | 49.7%
2 | 2901028 | 46.5%
3 | 238763 | 3.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

WhiteElo
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2173
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1741.87948
Minimum: 737
Maximum: 3110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: 737
5th percentile: 1309
Q1: 1559
Median: 1740
Q3: 1919
95th percentile: 2183
Maximum: 3110
Range: 2373
Interquartile range (IQR): 360

Descriptive statistics

Standard deviation: 265.7517445
Coefficient of variation (CV): 0.1525660918
Kurtosis: 0.01723628463
Mean: 1741.87948
Median Absolute Deviation (MAD): 180
Skewness: 0.08174090944
Sum: 1.086602361 × 10^10
Variance: 70623.98973
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
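
The derived statistics reported for WhiteElo follow from the primary ones by simple arithmetic. The sketch below recomputes them from the figures printed in this section. The variable names are illustrative and are not part of pandas-profiling, and the recomputed sum is only approximate because the reported mean is rounded.

```python
import math

# Figures copied from the WhiteElo summary in this report.
n_observations = 6238103
mean = 1741.87948
std = 265.7517445
minimum, maximum = 737, 3110
q1, q3 = 1559, 1919

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared
total = mean * n_observations     # Sum = mean * n (approximate: mean is rounded)

print(value_range, iqr)
print(round(cv, 6), round(variance, 2), f"{total:.4e}")
```

The same relations can be used to sanity-check the other numeric variables in the report.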
Value | Count | Frequency (%)
1500 | 43037 | 0.7%
1765 | 9550 | 0.2%
1754 | 9515 | 0.2%
1717 | 9498 | 0.2%
1748 | 9491 | 0.2%
1753 | 9468 | 0.2%
1702 | 9450 | 0.2%
1751 | 9446 | 0.2%
1745 | 9443 | 0.2%
1752 | 9437 | 0.2%
Other values (2163) | 6109768 | 97.9%

Value | Count | Frequency (%)
737 | 1 | < 0.1%
746 | 1 | < 0.1%
757 | 1 | < 0.1%
761 | 1 | < 0.1%
765 | 2 | < 0.1%
773 | 1 | < 0.1%
774 | 2 | < 0.1%
775 | 1 | < 0.1%
776 | 1 | < 0.1%
777 | 1 | < 0.1%

Value | Count | Frequency (%)
3110 | 1 | < 0.1%
3106 | 1 | < 0.1%
3102 | 1 | < 0.1%
3097 | 1 | < 0.1%
3092 | 1 | < 0.1%
3087 | 1 | < 0.1%
3082 | 1 | < 0.1%
3078 | 1 | < 0.1%
3076 | 1 | < 0.1%
3072 | 1 | < 0.1%

BlackElo
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2180
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1740.487152
Minimum: 728
Maximum: 3108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: 728
5th percentile: 1306
Q1: 1557
Median: 1739
Q3: 1919
95th percentile: 2184
Maximum: 3108
Range: 2380
Interquartile range (IQR): 362

Descriptive statistics

Standard deviation: 266.9153589
Coefficient of variation (CV): 0.1533566959
Kurtosis: 0.005050847269
Mean: 1740.487152
Median Absolute Deviation (MAD): 181
Skewness: 0.08218531088
Sum: 1.085733813 × 10^10
Variance: 71243.80881
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
1500 | 31607 | 0.5%
1729 | 9659 | 0.2%
1706 | 9513 | 0.2%
1704 | 9508 | 0.2%
1740 | 9506 | 0.2%
1736 | 9474 | 0.2%
1694 | 9414 | 0.2%
1795 | 9406 | 0.2%
1750 | 9401 | 0.2%
1707 | 9398 | 0.2%
Other values (2170) | 6121217 | 98.1%

Value | Count | Frequency (%)
728 | 1 | < 0.1%
731 | 2 | < 0.1%
738 | 1 | < 0.1%
752 | 1 | < 0.1%
758 | 1 | < 0.1%
760 | 1 | < 0.1%
762 | 2 | < 0.1%
768 | 1 | < 0.1%
771 | 2 | < 0.1%
772 | 1 | < 0.1%

Value | Count | Frequency (%)
3108 | 1 | < 0.1%
3104 | 1 | < 0.1%
3099 | 1 | < 0.1%
3095 | 1 | < 0.1%
3090 | 1 | < 0.1%
3085 | 1 | < 0.1%
3080 | 1 | < 0.1%
3079 | 1 | < 0.1%
3075 | 1 | < 0.1%
3073 | 1 | < 0.1%

WhiteRatingDiff
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1174
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5453305276
Minimum: -595
Maximum: 673
Zeros: 55057
Zeros (%): 0.9%
Negative: 2992385
Negative (%): 48.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: -595
5th percentile: -17
Q1: -9
Median: 1
Q3: 10
95th percentile: 17
Maximum: 673
Range: 1268
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 22.80828203
Coefficient of variation (CV): 41.82469324
Kurtosis: 117.8864107
Mean: 0.5453305276
Median Absolute Deviation (MAD): 9
Skewness: 2.949924809
Sum: 3401828
Variance: 520.2177293
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
10 | 278603 | 4.5%
9 | 272150 | 4.4%
-10 | 263882 | 4.2%
11 | 262887 | 4.2%
-9 | 259391 | 4.2%
8 | 250801 | 4.0%
-11 | 245938 | 3.9%
-8 | 238607 | 3.8%
7 | 223981 | 3.6%
12 | 221290 | 3.5%
Other values (1164) | 3720573 | 59.6%

Value | Count | Frequency (%)
-595 | 1 | < 0.1%
-593 | 1 | < 0.1%
-584 | 1 | < 0.1%
-572 | 1 | < 0.1%
-569 | 1 | < 0.1%
-568 | 1 | < 0.1%
-563 | 1 | < 0.1%
-561 | 1 | < 0.1%
-555 | 2 | < 0.1%
-549 | 2 | < 0.1%

Value | Count | Frequency (%)
673 | 1 | < 0.1%
669 | 1 | < 0.1%
668 | 1 | < 0.1%
667 | 1 | < 0.1%
662 | 1 | < 0.1%
661 | 1 | < 0.1%
660 | 1 | < 0.1%
659 | 1 | < 0.1%
657 | 1 | < 0.1%
655 | 2 | < 0.1%

BlackRatingDiff
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1161
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.3338538976
Minimum: -653
Maximum: 664
Zeros: 54741
Zeros (%): 0.9%
Negative: 3192407
Negative (%): 51.2%
Memory size: 47.6 MiB

Quantile statistics

Minimum: -653
5th percentile: -17
Q1: -10
Median: -1
Q3: 9
95th percentile: 17
Maximum: 664
Range: 1317
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 21.79724149
Coefficient of variation (CV): -65.28976191
Kurtosis: 107.6292509
Mean: -0.3338538976
Median Absolute Deviation (MAD): 9
Skewness: 1.749166484
Sum: -2082615
Variance: 475.1197364
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
-10 | 284295 | 4.6%
-9 | 277433 | 4.4%
-11 | 263542 | 4.2%
10 | 259357 | 4.2%
-8 | 255150 | 4.1%
9 | 254472 | 4.1%
11 | 242972 | 3.9%
8 | 234358 | 3.8%
-7 | 224955 | 3.6%
-12 | 221116 | 3.5%
Other values (1151) | 3720453 | 59.6%

Value | Count | Frequency (%)
-653 | 1 | < 0.1%
-596 | 1 | < 0.1%
-593 | 1 | < 0.1%
-581 | 1 | < 0.1%
-577 | 1 | < 0.1%
-575 | 1 | < 0.1%
-573 | 1 | < 0.1%
-570 | 1 | < 0.1%
-565 | 1 | < 0.1%
-562 | 1 | < 0.1%

Value | Count | Frequency (%)
664 | 1 | < 0.1%
661 | 2 | < 0.1%
658 | 1 | < 0.1%
654 | 1 | < 0.1%
653 | 1 | < 0.1%
652 | 1 | < 0.1%
651 | 1 | < 0.1%
647 | 1 | < 0.1%
646 | 1 | < 0.1%
645 | 1 | < 0.1%

ECO
Categorical

HIGH CARDINALITY

Distinct: 493
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
A00 | 421912
C00 | 289504
A40 | 287086
B01 | 286469
D00 | 229048
Other values (488) | 4724084

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)< 0.1%

Sample

1st rowD10
2nd rowC20
3rd rowB01
4th rowA00
5th rowB90

Common Values

Value | Count | Frequency (%)
A00 | 421912 | 6.8%
C00 | 289504 | 4.6%
A40 | 287086 | 4.6%
B01 | 286469 | 4.6%
D00 | 229048 | 3.7%
B00 | 203902 | 3.3%
C41 | 195353 | 3.1%
C20 | 178402 | 2.9%
B20 | 150030 | 2.4%
D02 | 128764 | 2.1%
Other values (483) | 3867633 | 62.0%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
a00 | 421912 | 6.8%
c00 | 289504 | 4.6%
a40 | 287086 | 4.6%
b01 | 286469 | 4.6%
d00 | 229048 | 3.7%
b00 | 203902 | 3.3%
c41 | 195353 | 3.1%
c20 | 178402 | 2.9%
b20 | 150030 | 2.4%
d02 | 128764 | 2.1%
Other values (483) | 3867633 | 62.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Opening
Categorical

HIGH CARDINALITY

Distinct: 2941
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
Van't Kruijs Opening | 132716
Scandinavian Defense: Mieses-Kotroc Variation | 112173
Modern Defense | 108084
Horwitz Defense | 95413
Sicilian Defense | 85585
Other values (2936) | 5704132

Length

Max length90
Median length32
Mean length30.92826906
Min length9

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique83 ?
Unique (%)< 0.1%

Sample

1st rowSlav Defense
2nd rowKing's Pawn Opening: 2.b3
3rd rowScandinavian Defense: Mieses-Kotroc Variation
4th rowVan't Kruijs Opening
5th rowSicilian Defense: Najdorf, Lipnitsky Attack

Common Values

Value | Count | Frequency (%)
Van't Kruijs Opening | 132716 | 2.1%
Scandinavian Defense: Mieses-Kotroc Variation | 112173 | 1.8%
Modern Defense | 108084 | 1.7%
Horwitz Defense | 95413 | 1.5%
Sicilian Defense | 85585 | 1.4%
French Defense: Knight Variation | 83478 | 1.3%
Caro-Kann Defense | 82371 | 1.3%
Scandinavian Defense | 78439 | 1.3%
Owen Defense | 73417 | 1.2%
Sicilian Defense: Bowdler Attack | 72418 | 1.2%
Other values (2931) | 5314009 | 85.2%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
defense | 3739310 | 15.1%
variation | 2279588 | 9.2%
game | 1240577 | 5.0%
opening | 1071318 | 4.3%
gambit | 903467 | 3.7%
sicilian | 827180 | 3.3%
queen's | 782636 | 3.2%
pawn | 689532 | 2.8%
attack | 677328 | 2.7%
king's | 581262 | 2.3%
Other values (1351) | 11952293 | 48.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TimeControl
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 312.604969
Minimum: 0
Maximum: 10800
Zeros: 105447
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: 0
5th percentile: 30
Q1: 120
Median: 240
Q3: 300
95th percentile: 900
Maximum: 10800
Range: 10800
Interquartile range (IQR): 180

Descriptive statistics

Standard deviation: 403.0660893
Coefficient of variation (CV): 1.289378382
Kurtosis: 269.9052787
Mean: 312.604969
Median Absolute Deviation (MAD): 120
Skewness: 11.87483035
Sum: 1950061995
Variance: 162462.2723
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=34)]
Value | Count | Frequency (%)
300 | 1508104 | 24.2%
180 | 1219187 | 19.5%
60 | 1033991 | 16.6%
600 | 718324 | 11.5%
120 | 330836 | 5.3%
30 | 325001 | 5.2%
900 | 235824 | 3.8%
480 | 141611 | 2.3%
240 | 122893 | 2.0%
420 | 120834 | 1.9%
Other values (24) | 481498 | 7.7%

Value | Count | Frequency (%)
0 | 105447 | 1.7%
30 | 325001 | 5.2%
45 | 27339 | 0.4%
60 | 1033991 | 16.6%
90 | 32323 | 0.5%
120 | 330836 | 5.3%
180 | 1219187 | 19.5%
240 | 122893 | 2.0%
300 | 1508104 | 24.2%
360 | 74591 | 1.2%

Value | Count | Frequency (%)
10800 | 3412 | 0.1%
9000 | 141 | < 0.1%
7200 | 484 | < 0.1%
5400 | 833 | < 0.1%
3600 | 5797 | 0.1%
2700 | 4764 | 0.1%
2400 | 1851 | < 0.1%
2100 | 1588 | < 0.1%
1800 | 38385 | 0.6%
1500 | 20990 | 0.3%

Termination
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
Normal | 4227924
Time forfeit | 2010179

Length

Max length12
Median length6
Mean length7.933452205
Min length6

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTime forfeit
2nd rowNormal
3rd rowTime forfeit
4th rowNormal
5th rowTime forfeit

Common Values

Value | Count | Frequency (%)
Normal | 4227924 | 67.8%
Time forfeit | 2010179 | 32.2%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value | Count | Frequency (%)
normal | 4227924 | 51.3%
time | 2010179 | 24.4%
forfeit | 2010179 | 24.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Event_enc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
0 | 2804020
1 | 1740741
2 | 1671138
3 | 22204

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row0
3rd row0
4th row3
5th row0

Common Values

Value | Count | Frequency (%)
0 | 2804020 | 44.9%
1 | 1740741 | 27.9%
2 | 1671138 | 26.8%
3 | 22204 | 0.4%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value | Count | Frequency (%)
0 | 2804020 | 44.9%
1 | 1740741 | 27.9%
2 | 1671138 | 26.8%
3 | 22204 | 0.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

ECO_enc
Real number (ℝ≥0)

ZEROS

Distinct: 493
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.0891624
Minimum: 0
Maximum: 492
Zeros: 421912
Zeros (%): 6.8%
Negative: 0
Negative (%): 0.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 57
Median: 135
Q3: 242
95th percentile: 330
Maximum: 492
Range: 492
Interquartile range (IQR): 185

Descriptive statistics

Standard deviation: 110.5153076
Coefficient of variation (CV): 0.6903359728
Kurtosis: -0.6998932807
Mean: 160.0891624
Median Absolute Deviation (MAD): 95
Skewness: 0.2578634316
Sum: 998652684
Variance: 12213.63322
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 421912 | 6.8%
200 | 289504 | 4.6%
40 | 287086 | 4.6%
101 | 286469 | 4.6%
300 | 229048 | 3.7%
100 | 203902 | 3.3%
241 | 195353 | 3.1%
220 | 178402 | 2.9%
120 | 150030 | 2.4%
302 | 128764 | 2.1%
Other values (483) | 3867633 | 62.0%

Value | Count | Frequency (%)
0 | 421912 | 6.8%
1 | 90015 | 1.4%
2 | 36762 | 0.6%
3 | 29203 | 0.5%
4 | 109224 | 1.8%
5 | 5279 | 0.1%
6 | 39348 | 0.6%
7 | 12935 | 0.2%
8 | 10507 | 0.2%
9 | 7306 | 0.1%

Value | Count | Frequency (%)
492 | 245 | < 0.1%
491 | 389 | < 0.1%
490 | 1060 | < 0.1%
489 | 1 | < 0.1%
488 | 74 | < 0.1%
487 | 1567 | < 0.1%
486 | 117 | < 0.1%
485 | 1558 | < 0.1%
484 | 3804 | 0.1%
483 | 8232 | 0.1%

Termination_enc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 MiB

Value | Count
0 | 4227924
1 | 2010179

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row1
4th row0
5th row1

Common Values

Value | Count | Frequency (%)
0 | 4227924 | 67.8%
1 | 2010179 | 32.2%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value | Count | Frequency (%)
0 | 4227924 | 67.8%
1 | 2010179 | 32.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TimeControl_enc
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.635421377
Minimum: 0
Maximum: 33
Zeros: 105447
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 47.6 MiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 5
Median: 7
Q3: 8
95th percentile: 18
Maximum: 33
Range: 33
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.874170924
Coefficient of variation (CV): 0.638363056
Kurtosis: 2.422290136
Mean: 7.635421377
Median Absolute Deviation (MAD): 2
Skewness: 1.304240679
Sum: 47630545
Variance: 23.7575422
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=34)]
Value | Count | Frequency (%)
8 | 1508104 | 24.2%
6 | 1219187 | 19.5%
3 | 1033991 | 16.6%
13 | 718324 | 11.5%
5 | 330836 | 5.3%
1 | 325001 | 5.2%
18 | 235824 | 3.8%
11 | 141611 | 2.3%
7 | 122893 | 2.0%
10 | 120834 | 1.9%
Other values (24) | 481498 | 7.7%

Value | Count | Frequency (%)
0 | 105447 | 1.7%
1 | 325001 | 5.2%
2 | 27339 | 0.4%
3 | 1033991 | 16.6%
4 | 32323 | 0.5%
5 | 330836 | 5.3%
6 | 1219187 | 19.5%
7 | 122893 | 2.0%
8 | 1508104 | 24.2%
9 | 74591 | 1.2%

Value | Count | Frequency (%)
33 | 3412 | 0.1%
32 | 141 | < 0.1%
31 | 484 | < 0.1%
30 | 833 | < 0.1%
29 | 5797 | 0.1%
28 | 4764 | 0.1%
27 | 1851 | < 0.1%
26 | 1588 | < 0.1%
25 | 38385 | 0.6%
24 | 20990 | 0.3%

EloDiff
Real number (ℝ)

Distinct: 2610
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.392327443
Minimum: -1691
Maximum: 1702
Zeros: 19988
Zeros (%): 0.3%
Negative: 3089314
Negative (%): 49.5%
Memory size: 47.6 MiB

Quantile statistics

Minimum: -1691
5th percentile: -332
Q1: -104
Median: 1
Q3: 107
95th percentile: 336
Maximum: 1702
Range: 3393
Interquartile range (IQR): 211

Descriptive statistics

Standard deviation: 202.34563
Coefficient of variation (CV): 145.3290539
Kurtosis: 2.040020487
Mean: 1.392327443
Median Absolute Deviation (MAD): 105
Skewness: 0.001761995209
Sum: 8685482
Variance: 40943.75399
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 19988 | 0.3%
-1 | 17885 | 0.3%
1 | 17854 | 0.3%
-2 | 17765 | 0.3%
5 | 17760 | 0.3%
-5 | 17754 | 0.3%
-3 | 17720 | 0.3%
4 | 17718 | 0.3%
-4 | 17703 | 0.3%
-6 | 17701 | 0.3%
Other values (2600) | 6058255 | 97.1%

Value | Count | Frequency (%)
-1691 | 1 | < 0.1%
-1608 | 1 | < 0.1%
-1569 | 1 | < 0.1%
-1528 | 1 | < 0.1%
-1483 | 1 | < 0.1%
-1481 | 1 | < 0.1%
-1479 | 1 | < 0.1%
-1478 | 1 | < 0.1%
-1469 | 2 | < 0.1%
-1468 | 1 | < 0.1%

Value | Count | Frequency (%)
1702 | 1 | < 0.1%
1578 | 1 | < 0.1%
1554 | 1 | < 0.1%
1507 | 1 | < 0.1%
1494 | 1 | < 0.1%
1480 | 1 | < 0.1%
1479 | 1 | < 0.1%
1475 | 1 | < 0.1%
1474 | 1 | < 0.1%
1469 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
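
As a minimal sketch of that recipe (not the implementation pandas-profiling itself uses), the code below ranks both variables and then divides the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks` and `spearman_rho` are illustrative, and giving tied values their average rank is an assumption the text does not spell out.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # raises ZeroDivisionError if either input is constant


# A cubic relationship is nonlinear but monotonic, so rho is (numerically) 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```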

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
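
A minimal sketch of this formula, assuming plain Python lists as input; `pearson_r` is an illustrative name, not a pandas-profiling function. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # raises ZeroDivisionError if either input is constant


x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 9.0, 12.0]
r = pearson_r(x, y)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_shifted = pearson_r(x, [3.0 * v + 7.0 for v in y])
print(r, r_shifted)
```

The second call illustrates the invariance under changes in location and scale mentioned above, here for a positive scale factor.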

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
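
The pair-counting definition can be sketched directly. This is the simple form described in the text (often called tau-a); library implementations frequently use tie-adjusted variants, so values may differ when the data contain ties. The name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts toward neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This quadratic-time loop is fine for illustration but would be far too slow for the 6238103 rows profiled here.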

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
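
A minimal sketch of the bias-corrected estimator, assuming the commonly cited form of Bergsma's correction: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at zero, and the row and column counts r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the two contingency tables are made up for illustration and are not taken from the dataset.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and every column has a nonzero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


independent = [[10, 20], [20, 40]]   # rows proportional: no association
perfect = [[50, 0], [0, 50]]         # each row determines the column

v_independent = cramers_v_corrected(independent)  # 0.0
v_perfect = cramers_v_corrected(perfect)          # approximately 1
```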

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

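| | Event | Result | WhiteElo | BlackElo | WhiteRatingDiff | BlackRatingDiff | ECO | Opening | TimeControl | Termination | Event_enc | ECO_enc | Termination_enc | TimeControl_enc | EloDiff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Classical | 1 | 1901 | 1896 | 11.0 | -11.0 | D10 | Slav Defense | 300 | Time forfeit | 2 | 310 | 1 | 8 | 5 |
| 1 | Blitz | 2 | 1641 | 1627 | -11.0 | 12.0 | C20 | King's Pawn Opening: 2.b3 | 300 | Normal | 0 | 220 | 0 | 8 | 14 |
| 2 | Blitz | 1 | 1647 | 1688 | 13.0 | -13.0 | B01 | Scandinavian Defense: Mieses-Kotroc Variation | 180 | Time forfeit | 0 | 101 | 1 | 6 | -41 |
| 3 | Correspondence | 1 | 1706 | 1317 | 27.0 | -25.0 | A00 | Van't Kruijs Opening | 0 | Normal | 3 | 0 | 0 | 0 | 389 |
| 4 | Blitz | 2 | 1945 | 1900 | -14.0 | 13.0 | B90 | Sicilian Defense: Najdorf, Lipnitsky Attack | 180 | Time forfeit | 0 | 190 | 1 | 6 | 45 |
| 5 | Blitz | 2 | 1773 | 1809 | -10.0 | 10.0 | C27 | Vienna Game | 180 | Normal | 0 | 227 | 0 | 6 | -36 |
| 6 | Blitz | 2 | 1895 | 1886 | -12.0 | 12.0 | B10 | Caro-Kann Defense: Two Knights Attack | 180 | Time forfeit | 0 | 110 | 1 | 6 | 9 |
| 7 | Blitz | 1 | 2155 | 2356 | 20.0 | -20.0 | D02 | Queen's Pawn Game: London System | 180 | Normal | 0 | 302 | 0 | 6 | -201 |
| 8 | Blitz | 2 | 2010 | 2111 | -9.0 | 9.0 | A45 | Indian Game | 300 | Normal | 0 | 45 | 0 | 8 | -101 |
| 9 | Blitz | 1 | 1764 | 1773 | 12.0 | -12.0 | B01 | Scandinavian Defense: Mieses-Kotroc Variation | 180 | Time forfeit | 0 | 101 | 1 | 6 | -9 |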
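
The report does not say how the derived columns were built. In the rows shown, EloDiff equals WhiteElo minus BlackElo, and ECO_enc is consistent with "letter index times 100 plus the two-digit number" (A00 → 0, B01 → 101, D10 → 310). The sketch below checks that reading against a few of the first rows; both helper functions are hypothetical reconstructions, not the code that produced the dataset.

```python
def elo_diff(white_elo, black_elo):
    """Rating gap from White's point of view (matches the sample rows)."""
    return white_elo - black_elo


def eco_enc(eco):
    """Hypothetical reconstruction of ECO_enc: 'A'-'E' map to 0-4 (hundreds)
    and the two digits are added on top."""
    return (ord(eco[0]) - ord("A")) * 100 + int(eco[1:])


# (WhiteElo, BlackElo, ECO, ECO_enc, EloDiff) copied from the first rows.
sample = [
    (1901, 1896, "D10", 310, 5),
    (1641, 1627, "C20", 220, 14),
    (1647, 1688, "B01", 101, -41),
    (1706, 1317, "A00", 0, 389),
    (2010, 2111, "A45", 45, -101),
]

matches = all(
    eco_enc(eco) == enc and elo_diff(white, black) == diff
    for white, black, eco, enc, diff in sample
)
print(matches)  # True
```

Agreement on a handful of rows does not prove the rule holds for all 493 ECO codes, so treat it as a working hypothesis.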

Last rows

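| | Event | Result | WhiteElo | BlackElo | WhiteRatingDiff | BlackRatingDiff | ECO | Opening | TimeControl | Termination | Event_enc | ECO_enc | Termination_enc | TimeControl_enc | EloDiff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6238093 | Blitz | 3 | 1589 | 1615 | 1.0 | -1.0 | C45 | Scotch Game | 420 | Normal | 0 | 245 | 0 | 10 | -26 |
| 6238094 | Classical | 1 | 1744 | 1337 | 2.0 | -3.0 | A40 | Modern Defense | 900 | Normal | 2 | 40 | 0 | 18 | 407 |
| 6238095 | Bullet | 1 | 1802 | 1965 | 16.0 | -16.0 | A00 | Hungarian Opening: Reversed Modern Defense | 0 | Time forfeit | 1 | 0 | 1 | 0 | -163 |
| 6238096 | Bullet | 1 | 1593 | 1912 | 19.0 | -24.0 | B25 | Sicilian Defense: Closed Variation, Traditional | 60 | Normal | 1 | 125 | 0 | 3 | -319 |
| 6238097 | Blitz | 2 | 1491 | 1509 | -11.0 | 10.0 | C20 | King's Pawn Game: Napoleon Attack | 120 | Normal | 0 | 220 | 0 | 5 | -18 |
| 6238098 | Blitz | 1 | 1248 | 1303 | 13.0 | -17.0 | B54 | Sicilian Defense | 180 | Normal | 0 | 154 | 0 | 6 | -55 |
| 6238099 | Classical | 1 | 1328 | 1292 | 10.0 | -11.0 | C40 | King's Knight Opening | 1800 | Normal | 2 | 240 | 0 | 25 | 36 |
| 6238100 | Bullet | 2 | 1660 | 1658 | -11.0 | 11.0 | B21 | Sicilian Defense: Smith-Morra Gambit | 120 | Normal | 1 | 121 | 0 | 5 | 2 |
| 6238101 | Bullet | 2 | 1726 | 1776 | -8.0 | 9.0 | A09 | Reti Opening: Reti Accepted | 60 | Normal | 1 | 9 | 0 | 3 | -50 |
| 6238102 | Classical | 2 | 1948 | 1992 | -74.0 | 8.0 | C00 | Rat Defense: Small Center Defense | 480 | Normal | 2 | 200 | 0 | 11 | -44 |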

Duplicate rows

Most frequently occurring

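| | Event | Result | WhiteElo | BlackElo | WhiteRatingDiff | BlackRatingDiff | ECO | Opening | TimeControl | Termination | Event_enc | ECO_enc | Termination_enc | TimeControl_enc | EloDiff | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2494 | Correspondence | 1 | 1500 | 1500 | 162.0 | -163.0 | A00 | Van't Kruijs Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 13 |
| 2589 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | A00 | Grob Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 9 |
| 2665 | Correspondence | 3 | 1500 | 1500 | 0.0 | 0.0 | A00 | Kadas Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 9 |
| 2667 | Correspondence | 3 | 1500 | 1500 | 0.0 | 0.0 | A00 | Van't Kruijs Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 9 |
| 2590 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | A00 | Hungarian Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 8 |
| 2594 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | A00 | Van't Kruijs Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 8 |
| 2616 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | C20 | King's Pawn Game: Leonardis Variation | 0 | Normal | 3 | 220 | 0 | 0 | 0 | 8 |
| 2637 | Correspondence | 2 | 1500 | 1500 | -157.0 | 167.0 | A00 | Van't Kruijs Opening | 0 | Normal | 3 | 0 | 0 | 0 | 0 | 8 |
| 2602 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | B01 | Scandinavian Defense | 0 | Normal | 3 | 101 | 0 | 0 | 0 | 7 |
| 2622 | Correspondence | 2 | 1500 | 1500 | -163.0 | 162.0 | C44 | King's Knight Opening: Normal Variation | 0 | Normal | 3 | 244 | 0 | 0 | 0 | 7 |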
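
A table like this can be approximated by counting identical rows and keeping those that occur more than once. The sketch below does so with the standard library on a toy list; the tuples use a subset of the columns and their repetition counts are invented for illustration. It treats the count as the total number of occurrences, which is an assumption about how the report defines "# duplicates".

```python
from collections import Counter

# Toy rows: (Event, Result, WhiteElo, BlackElo, ECO, Opening).
rows = [
    ("Correspondence", 1, 1500, 1500, "A00", "Van't Kruijs Opening"),
    ("Correspondence", 1, 1500, 1500, "A00", "Van't Kruijs Opening"),
    ("Correspondence", 2, 1500, 1500, "A00", "Grob Opening"),
    ("Correspondence", 1, 1500, 1500, "A00", "Van't Kruijs Opening"),
    ("Correspondence", 2, 1500, 1500, "A00", "Grob Opening"),
    ("Blitz", 1, 1764, 1773, "B01", "Scandinavian Defense: Mieses-Kotroc Variation"),
]

# Count each distinct row, then keep rows seen more than once, most frequent first.
counts = Counter(rows)
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(n, row[5])
# 3 Van't Kruijs Opening
# 2 Grob Opening
```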